BC029520 1564 bp mRNA linear PRI 28-JUL-2005

Homo sapiens WD repeat, SAM and U-box domain containing 1, mRNA 
(cDNA clone MGC:33855 IMAGE:5301559), complete cds.
BC029520 

BC029520.1 GI:20810486
MGC. 

Homo sapiens (human) 
Homo sapiens 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; 
Hominidae; Homo. 

1 (bases 1 to 1564) 

Strausberg,R.L., Feingold,E.A., Grouse,L.H., Derge,J.G.,
Klausner,R.D., Collins,F.S., Wagner,L., Shenmen,C.M., Schuler,G.D.,
Altschul,S.F., Zeeberg,B., Buetow,K.H., Schaefer,C.F., Bhat,N.K.,
Hopkins,R.F., Jordan,H., Moore,T., Max,S.I., Wang,J., Hsieh,F.,
Diatchenko,L., Marusina,K., Farmer,A.A., Rubin,G.M., Hong,L.,
Stapleton,M., Soares,M.B., Bonaldo,M.F., Casavant,T.L.,
Scheetz,T.E., Brownstein,M.J., Usdin,T.B., Toshiyuki,S.,
Carninci,P., Prange,C., Raha,S.S., Loquellano,N.A., Peters,G.J.,
Abramson,R.D., Mullahy,S.J., Bosak,S.A., McEwan,P.J.,
McKernan,K.J., Malek,J.A., Gunaratne,P.H., Richards,S.,
Worley,K.C., Hale,S., Garcia,A.M., Gay,L.J., Hulyk,S.W.,
Villalon,D.K., Muzny,D.M., Sodergren,E.J., Lu,X., Gibbs,R.A.,
Fahey,J., Helton,E., Ketteman,M., Madan,A., Rodrigues,S.,
Sanchez,A., Whiting,M., Madan,A., Young,A.C., Shevchenko,Y.,
Bouffard,G.G., Blakesley,R.W., Touchman,J.W., Green,E.D.,
Dickson,M.C., Rodriguez,A.C., Grimwood,J., Schmutz,J., Myers,R.M.,
Butterfield,Y.S., Krzywinski,M.I., Skalska,U., Smailus,D.E.,
Schnerch,A., Schein,J.E., Jones,S.J. and Marra,M.A.
Mammalian Gene Collection Program Team 

Generation and initial analysis of more than 15,000 full-length 
human and mouse cDNA sequences 

Proc. Natl. Acad. Sci. U.S.A. 99 (26), 16899-16903 (2002) 
12477932 

2 (bases 1 to 1564) 

NIH MGC Project 
Direct Submission 

Submitted (01-MAY-2002) National Institutes of Health, Mammalian
Gene Collection (MGC), Bethesda, MD 20892-2590, USA
NIH-MGC Project URL: http://mgc.nci.nih.gov
Contact: MGC help desk
Email: cgapbs-r@mail.nih.gov

Tissue Procurement: Miklos Palkovits, M.D., Ph.D. 

cDNA Library Preparation: Michael J. Brownstein (NHGRI) & Shiraki 
Toshiyuki and Piero Carninci (RIKEN)

cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL) 

DNA Sequencing by: Sequencing Group at the Stanford Human Genome
Center, Stanford University School of Medicine, Stanford, CA 94305
Web site: http://www-shgc.stanford.edu

Contact: (Dickson, Mark) mcd@paxil.stanford.edu 

Dickson, M., Schmutz, J., Grimwood, J., Rodriquez, A., and Myers, 
R. M. 



Clone distribution: MGC clone distribution information can be found 
through the I.M.A.G.E. Consortium/LLNL at: http://image.llnl.gov
Series: IRAK Plate: 48 Row: o Column: 11 

This clone was selected for full length sequencing because it 
passed the following selection criteria: matched mRNA gi: 22749102.
FEATURES Location/Qualifiers 
source 1..1564

/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
/clone="MGC:33855 IMAGE:5301559"
/tissue_type="Brain, hypothalamus"
/clone_lib="NIH_MGC_96"
/lab_host="DH10B"
/note="Vector: pBluescriptR"
gene 1..1564

/gene="WDSUB1"

/note="synonyms: FLJ36175, UBOX6"
/db_xref="GeneID:151525"
CDS 146..1300

/gene="WDSUB1"
/codon_start=1
/product="WDSUB1 protein"
/protein_id="AAH29520.1"
/db_xref="GI:20810487"
/db_xref="GeneID:151525"

/translation="MVKLIHTLADHGDDVNCCAFSFSLLATCSLDKTIRLYSLRDFTE
LPHSPLKFHTYAVHCCCFSPSGHILASCSTDGTTVLWNTENGQMLAVMEQPSGSPVRV
CQFSPDSTCLASGAADGTVVLWNAQSYKLYRCGSVKDGSLAACAFSPNGSFFVTGSSC
GDLTVWDDKMRCLHSEKAHDLGITCCDFSSQPVSDGEQGLQFFRLASCGQDCQVKIWI
VSFTDILARRTEHQLKQFTEDWSEEDVSTWLCAQDLKDLVGIFKMNNIDGKELLNLTK
ESLADDLKIESLGLRSKVLRKIEELRTKVKSLSSGIPDEFICPITRELMKDPVIASDG
YSYEKEAMENWISKKKRTSPMTNLVLPSAVLTPNRTLKMAINRWLETHQK"
misc_difference 663

/gene="WDSUB1"

/note="'T' in cDNA is 'C' in the human genome; amino acid
difference: 'L' in cDNA, 'P' in the human genome. The
chimpanzee genome agrees with the cDNA sequence,
suggesting that this difference is unlikely to be due to
an artifact; Differences found between this sequence and
the human reference genome (build 35) are described in
misc_difference features below and these differences were
also compared to chimpanzee genomic sequences available as
of 09/15/2004 00:00:00"
misc_difference 812

/gene="WDSUB1"

/note="'G' in cDNA is 'C' in the human genome; amino acid
difference: 'D' in cDNA, 'H' in the human genome. The
chimpanzee genome agrees with the human genomic sequence
and not the cDNA; Differences found between this sequence
and the human reference genome (build 35) are described in
misc_difference features below and these differences were
also compared to chimpanzee genomic sequences available as
of 09/15/2004 00:00:00"
misc_difference 1536..1564

/gene="WDSUB1"

/note="polyA tail: 29 bases do not align to the human
genome; Differences found between this sequence and the
human reference genome (build 35) are described in
misc_difference features below and these differences were
also compared to chimpanzee genomic sequences available as
of 09/15/2004 00:00:00"

ORIGIN 

1 ctgttccctc tgctctgggt ctccgccggc gcccgccccg ccagcctcac ctgcgcggca 
61 cgtgacccgc accgcccgtg ggcaccttga aggcggatcc cgcgcgcccc cgctcctgca 
121 ggctgttttt cttcaaataa agaacatggt gaaactgatt cacacattag ctgatcatgg 
181 tgacgatgtc aactgctgtg ccttctcctt ttccctcttg gctacttgct ccttggacaa 
241 aacaattcgc ctgtactcgt tacgtgactt tactgaactg ccacattctc cattgaagtt 
301 tcatacctat gctgtccact gctgctgttt ctccccttca ggacatattt tggcatcgtg 
361 ttcaacagat ggtaccactg tcctatggaa tactgaaaat ggacagatgc tggcagtgat 
421 ggaacagcct agtggcagcc ctgtgagggt ttgccagttt tccccagact ccacgtgttt 
481 ggcatcaggg gcagctgatg gaactgtggt tttgtggaat gcacagtcat acaaattata 
541 tagatgtggt agtgttaaag atggctcctt ggcggcatgt gcattttctc ctaatggaag 
601 cttctttgtc actggctcct catgtggtga tttaacagtg tgggatgata aaatgaggtg 
661 tctgcatagt gaaaaagcac atgatcttgg aattacctgc tgcgattttt cttcacagcc 
721 agtttctgat ggagaacaag gtcttcagtt ttttcgactg gcatcatgtg gtcaggattg 
781 ccaagtcaaa atttggattg tttcttttac cgatatctta gcaaggcgca cagaacatca 
841 gctgaagcaa tttaccgaag attggtcaga ggaggatgtc tcaacatggc tttgtgcaca 
901 agatttaaaa gatcttgttg gtattttcaa gatgaataac attgatggaa aagaactgtt 
961 gaatcttaca aaagaaagtc tggctgatga tttgaaaatt gaatctctag gactgcgtag 
1021 taaagtgctg aggaaaattg aagagctcag gaccaaggtt aaatcccttt cttcaggaat 
1081 tcctgatgaa tttatatgtc caataactag agaacttatg aaagatccgg tcatcgcatc 
1141 agatggctat tcatatgaaa aggaagcaat ggaaaattgg atcagcaaaa agaaacgtac 
1201 aagtcccatg acaaatcttg ttcttccttc agcggtactt acaccaaata ggactctgaa 
1261 aatggccatc aatagatggc tggagacaca ccaaaagtaa aattgttgat attgtattat 
1321 ttatattttc agtgatctca tttgaatgat ttataggtaa atactaatca gacattatta 
1381 aaagcaaaac aggaaaaagg taaacttctt aaatttagtt acctataaaa attgtcaatt 
1441 ttcattcttt aaaaaacaca tggacttact ataaaagcct ttttgtacta gtgaaaagaa 
1501 tcttcagcta tatagaaata aagttatact ttaaaaaaaa aaaaaaagaa aaaaaaaaaa 
1561 aaaa 